Learning DevOps
The DevOps Culture and Infrastructure as Code practices
DevOps stands for Development and Operations. It’s a term often associated with the practices of Continuous Integration (CI) and Continuous Delivery (CD) and with Infrastructure as Code (IaC).
Getting started with DevOps
The term was introduced between 2007 and 2009, and it refers to the practices that aim to reduce the barrier between developers, who innovate and deliver, and operations, who want to guarantee stability.
DevOps is an extension of agile processes. The collaboration between Dev and Ops is achieved by:
- More frequent application development with CI and CD.
- The implementation and automation of unit and integration tests, which can follow behavior-driven development (BDD) or test-driven development (TDD).
- Collecting feedback from users.
- Monitoring applications and infrastructure.
The DevOps culture can be divided into three axes:
- Collaboration: Close cooperation between the Dev and Ops teams, as described above.
- Processes: The DevOps process is divided into several phases that are repeated cyclically:
- Planning and prioritizing functionalities.
- Development.
- Continuous integration and delivery.
- Continuous deployment.
- Continuous monitoring.
- Tools: Choosing the right tools for the job can close the communication gaps. Developers need to use the Ops tools to detect performance problems as soon as possible, and Operations must automate the process of creating and updating the infrastructure.
Donovan Brown’s definition of DevOps is:
“DevOps is the union of people, processes, and products to enable continuous delivery of value to our end users.”
The benefits, within an enterprise, of a DevOps culture are:
- Better collaboration and communication in teams.
- Shorter production times, and thus better performance and end user satisfaction.
- Reduced infrastructure costs, thanks to IaC.
- Less time wasted, thanks to iterative cycles, which reduce application errors, and automation tools, which reduce manual tasks.
Implementing CI/CD and continuous deployment
Continuous Integration
The definition of CI, by Martin Fowler, is:
“Continuous integration is a software development practice where members of a team integrate their work frequently… Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible.”
So CI is an automatic process that checks that the application’s code is complete and correct every time a member makes a change.
Implementing CI
To set up CI, it is necessary to have a Source Code Manager (SCM) that centralizes the code of all members; this manager can be of any type (e.g., Git, SVN). It’s also important to have an automatic build manager that supports continuous integration (e.g., Jenkins, GitHub Actions).
Each team member will work on the application code daily. Then, several times a day, each team member will archive or commit their code, preferably in small commits to easily fix errors. The commits will be integrated into the rest of the code, along with other members’ commits, thanks to the CI process.
The CI server, which will execute the CI process, needs to be automated and triggered by each commit. After retrieving the code, the server will:
- Build the application package.
- Perform unit tests and calculate code coverage. Code that is archived without going through the CI process is not verified. Deactivating test execution must be done if and only if it is necessary to deliver quickly or if the code added in the commit is not essential to the application. CI, however, cannot catch all errors, especially those that happen in production; the time needed to fix production errors is therefore deducted from the time saved by the CI process.
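As an illustrative sketch only, such a commit-triggered CI process could be declared in a GitHub Actions workflow; the Node.js toolchain and the script names used here are assumptions, not part of the text:

```yaml
# .github/workflows/ci.yml -- hypothetical CI pipeline sketch
name: CI
on: [push]                              # triggered by each commit pushed to the SCM
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4       # retrieve the code
      - run: npm ci                     # restore dependencies
      - run: npm run build              # build the application package
      - run: npm test -- --coverage     # run unit tests and calculate code coverage
```

Any failing step stops the workflow, which is how integration errors are detected as quickly as possible.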
Continuous Delivery
Continuous Delivery starts after the CI process has passed; its role is to deploy the application automatically to non-production (staging) environments.
The process starts from the application package, built by CI, which will be installed through automated tasks. During this phase, it is also possible to execute functional and acceptance tests.
Unlike CI, in CD the application is tested with all of its dependencies. If the application in question is a microservice application, then:
- CI will test the single microservice in development.
- CD will test and validate the entire application, as well as the APIs.
In practice, CI and CD are linked in an integrated environment, so that the developer can execute unit tests and test the whole application. It’s important that the package generated by CI and the package installed in all the environments are the same; however, configuration files can differ depending on the environment.
The tools necessary for CI/CD are:
- A package manager.
- A configuration manager: To manage configuration changes in CD.
The deployment of the application in each staging environment can be triggered:
- Automatically: After the successful execution in the previous stage.
- Manually: In case of a sensitive environment, such as the production one, that may require manual approval from a person responsible for validating the application.
Continuous deployment
It’s an extension of CD: a process that starts with the developer’s commit and ends with the deployment of the change in production.
This practice is rarely implemented because it requires a variety of tests to guarantee that the application works. In addition, the continuous deployment process must take into account all the steps to restore the application in the event of a production problem.
It is used for:
- Toggling features: Permits toggling an application’s features on and off without the need to redeploy the application.
- Blue-green production infrastructure: This infrastructure ensures zero downtime during deployment. There are two identical environments, one blue and one green; the new version is deployed to the idle environment first, and traffic is then switched over to it.
The process is the same as CD, with the difference that the end-to-end deployment is fully automated.
Understanding IaC practices
It’s a practice that consists of writing the code for the resources that make up an infrastructure.
This practice is widely used since:
- Deploying infrastructure manually takes a lot of time, and there can be many manual errors.
- Scalability is important in cloud computing.
Benefits of IaC
- Having a standard infrastructure reduces errors.
- The code that generates the infrastructure can be versioned.
- The deployment of the infrastructure is faster, thanks to the integration into the CI/CD pipeline.
- Reduced costs, and better management and control.
IaC languages and tools
The languages and tools can be:
- Scripting: This category includes tools such as Bash, PowerShell and clients provided by the cloud provider. The problem with these types of tools is that they require a lot of lines of code.
- Declarative: These tools allow you to define an infrastructure by writing its configuration and properties in a file. Examples are Terraform, Vagrant, and Ansible.
- Programmatic: The infrastructure is programmed with a general-purpose programming language, similar to the ones used by developers. Examples are Pulumi and Terraform CDK.
The IaC topology
There are various IaC topologies:
- Deploying and provisioning the infrastructure: Where you instantiate the resources that make up the infrastructure. These can be Platform-as-a-Service (PaaS) and serverless resource types, but also the entire network.
- Server configuration and templating: Where you configure the virtual machines. To optimize the process, it is possible to use server models, called images.
- Containerization: An alternative to deploying applications on VMs. The most used technology is Docker and the containers are configured with a Dockerfile.
- Configuration and deployment in Kubernetes: Kubernetes is a container orchestrator that deploys containers, manages the network architecture, and handles volume management. It is configured with YAML files.
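As an illustration of those YAML files, a minimal Kubernetes Deployment might look like this sketch (the names and the container image are placeholders):

```yaml
# deployment.yml -- hypothetical Deployment keeping two replicas of a container
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 2                            # Kubernetes keeps two containers running
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: webapp
          image: myregistry/webapp:1.0   # image built from a Dockerfile
          ports:
            - containerPort: 80
```

Applying this file with kubectl lets the orchestrator converge the cluster toward the declared state.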
Provisioning Cloud Infrastructure with Terraform
Terraform is one of the most popular tools for IaC. In this chapter we will be using Terraform with Azure.
An Azure subscription and a code editor are needed for this chapter.
Installing Terraform
Manual installation
- Reach the download page.
- Unzip and copy the binary into an execution directory.
- Add that directory to the PATH environment variable.
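The manual steps above can be sketched as a shell session (the installation directory and the archive name are placeholder assumptions):

```shell
# Choose an execution directory for the binary (placeholder path)
INSTALL_DIR="$HOME/.local/bin"
mkdir -p "$INSTALL_DIR"

# Unzip the downloaded archive into it (uncomment after downloading):
# unzip terraform_1.0.0_linux_amd64.zip -d "$INSTALL_DIR"

# Add that directory to the PATH environment variable
export PATH="$PATH:$INSTALL_DIR"
```

After this, the terraform binary is resolvable from any working directory in the current session; add the export to your shell profile to make it permanent.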
Installation by script on Linux
The installation on Linux can be done via script or via the apt package manager:
- Script installation:
TERRAFORM_VERSION="1.0.0" # Update with your desired version
curl -Os https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip
curl -Os https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_SHA256SUMS
curl https://keybase.io/hashicorp/pgp_keys.asc | gpg --import
curl -Os https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_SHA256SUMS.sig
gpg --verify terraform_${TERRAFORM_VERSION}_SHA256SUMS.sig terraform_${TERRAFORM_VERSION}_SHA256SUMS
shasum -a 256 -c terraform_${TERRAFORM_VERSION}_SHA256SUMS 2>&1 | grep "${TERRAFORM_VERSION}_linux_amd64.zip: OK"
unzip -o terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /usr/local/bin
- With the apt package manager:
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common curl
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install terraform
Integrating Terraform with Azure Cloud Shell
Terraform is integrated into the Azure Cloud Shell. The steps to enter the shell are:
- Log in to the Azure portal.
- Open the Cloud Shell and choose its mode: either Bash or PowerShell.
- Run terraform from the command shell.
Configuring Terraform for Azure
To provision in a cloud infrastructure like Azure, we must first configure Terraform to allow the manipulation of resources in an Azure subscription.
To do this, we will create a new Azure Service Principal (SP) in Azure Active Directory (AD), an application user who has permission to manage Azure resources.
Creating the Azure SP
This step can be done from the portal or via script with the az cli command:
az ad sp create-for-rbac --name="<ServicePrincipal name>" --role="Contributor" --scopes="/subscriptions/<subscriptionId>"
This command will return three things:
- The application ID.
- The client secret.
- The tenant ID.
The SP is created in Azure AD:
Configuring the Terraform provider
Now, we will configure our Terraform configuration to connect to Azure using this SP:
- In a directory of your choice, create the file provider.tf (.tf is the extension for Terraform files) and paste the code:
provider "azurerm" {
  features {}
  subscription_id = "<subscription ID>"
  client_id       = "<Client ID>"
  client_secret   = "<Client Secret>"
  tenant_id       = "<Tenant Id>"
}
- Since it is not advisable to put identification information in plain text, we will reduce the preceding code to:
provider "azurerm" {
  features {}
}
And we will pass the identification information through the environment variables ARM_SUBSCRIPTION_ID, ARM_CLIENT_ID, ARM_CLIENT_SECRET, and ARM_TENANT_ID.
The Terraform configuration for local development and testing
To test the Terraform code quickly, it is possible to use your own Azure account. To do this, run az login before executing the code.
If there are several subscriptions, you can select one with the command: az account set --subscription="<Subscription ID>"
Writing a Terraform script to deploy an Azure infrastructure
We will provision a simple Azure architecture with Terraform that is composed of the following components:
- Azure resource group.
- Network configuration: Composed of virtual network and subnet.
- In this subnet, we will create a VM with a public IP address.
This code will be placed in main.tf, in the same directory as provider.tf.
For the resource group:
resource "azurerm_resource_group" "rg" {
name = "bookRg"
location = "West Europe"
tags = {
environment = "Terraform Azure"
}
}
Any piece of Terraform code is composed of:
- A block type: resource or data.
- The type of the resource to be managed (in this case azurerm_resource_group).
- An internal Terraform ID (in this case rg).
- A list of properties of the resource.
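As a hedged example of the data block type mentioned above (the resource group name is hypothetical), reading an existing resource group instead of managing it looks like this:

```hcl
# Read an existing resource group that Terraform does not manage
data "azurerm_resource_group" "existing" {
  name = "an-existing-rg"   # hypothetical name of a pre-existing group
}

# Its properties can then be referenced elsewhere, for example:
# data.azurerm_resource_group.existing.location
```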
For the network part (virtual network and subnet):
resource "azurerm_virtual_network" "vnet" {
name = "book-vnet"
location = "West Europe"
address_space = ["10.0.0.0/16"]
resource_group_name = azurerm_resource_group.rg.name
}
resource "azurerm_subnet" "subnet" {
name = "book-subnet"
virtual_network_name = azurerm_virtual_network.vnet.name
resource_group_name = azurerm_resource_group.rg.name
address_prefixes = ["10.0.10.0/24"]
}
In this code, we create a VNet, book-vnet, and a subnet, book-subnet. We can also see that, for the IDs, we use references to other Terraform resources.
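These references can also be exposed to users of the configuration with an output block; a small sketch:

```hcl
# Expose the subnet ID so that other configurations or users can consume it
output "subnet_id" {
  value = azurerm_subnet.subnet.id
}
```

After an apply, the value is printed in the console and retrievable with terraform output.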
For provisioning the virtual machine we will need:
- A network interface:
resource "azurerm_network_interface" "nic" {
  name                = "book-nic"
  location            = "West Europe"
  resource_group_name = azurerm_resource_group.rg.name

  ip_configuration {
    name                          = "bookipconfig"
    subnet_id                     = azurerm_subnet.subnet.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.pip.id
  }
}
- A public IP address:
resource "azurerm_public_ip" "pip" {
  name                = "book-ip"
  location            = "West Europe"
  resource_group_name = azurerm_resource_group.rg.name
  allocation_method   = "Dynamic"
  domain_name_label   = "bookdevops"
}
- An Azure Storage object for the diagnostics:
The type of storage, in our case, is Standard LRS.
resource "azurerm_storage_account" "stor" {
  name                     = "bookstor"
  location                 = "West Europe"
  resource_group_name      = azurerm_resource_group.rg.name
  account_tier             = "Standard"
  account_replication_type = "LRS"
}
- A virtual machine: We will be using an Ubuntu virtual machine:
resource "azurerm_linux_virtual_machine" "vm" {
  name                  = "bookvm"
  location              = "West Europe"
  resource_group_name   = azurerm_resource_group.rg.name
  size                  = "Standard_DS1_v2"
  network_interface_ids = [azurerm_network_interface.nic.id]

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "16.04-LTS"
    version   = "latest"
  }
  ...
}
The complete source code is available here.
Following some Terraform good practices
Some good practices for writing good Terraform code are:
Separate files: Since every file with the .tf extension in the execution directory is automatically taken into account, it is good to separate the code into several files to improve readability:
- Rg.tf: For the resource group.
- Network.tf: For the VNet and subnet.
- Compute.tf: For the network interface, public IP, storage, and VM.
Protection of sensitive data: To store sensitive data, such as passwords, it is possible to use Azure Key Vault or HashiCorp Vault. You can then retrieve them via Terraform.
Configuration with variables and interpolation functions: Often the infrastructure that will host the application is the same for all stages. However, some configuration may change from one stage to another. To make the code more flexible, we can add variables with the following steps:
- Declare the variables by adding the following code to the global Terraform code, or to a separate variables.tf file:
variable "resource_group_name" {
  description = "Name of the resource group"
}

variable "location" {
  description = "Location of the resource"
  default     = "West Europe"
}

variable "application_name" {
  description = "Name of the application"
}
- Initiate the values in a .tfvars file named terraform.tfvars, with the format variable_name = value.
- Use the variables in the code with the format var.<name of the variable>. For example:
resource "azurerm_resource_group" "rg" {
  name     = var.resource_group_name
  location = var.location

  tags = {
    environment = "Terraform Azure"
  }
}
In addition, it is possible to use built-in functions that can be used to manipulate data or variables.
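As a hedged illustration of such functions (lower, format, and replace are real Terraform built-ins; the naming convention shown is an assumption):

```hcl
resource "azurerm_resource_group" "rg_example" {
  # Compose a name such as "rg-myapp-westeurope" from the variables
  name     = lower(format("rg-%s-%s", var.application_name, replace(var.location, " ", "")))
  location = var.location
}
```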
Running Terraform for deployment
Now that the configuration is written we can run Terraform and deploy our infrastructure.
However, it is first necessary to provide authentication with the Azure SP to ensure that Terraform can manage the Azure resources. We can do this in two ways:
- Configuring manually the environment variables needed for Terraform:
export ARM_SUBSCRIPTION_ID=xxxxx-xxxxx-xxxx-xxxx
export ARM_CLIENT_ID=xxxxx-xxxxx-xxxx-xxxx
export ARM_CLIENT_SECRET=xxxxxxxxxxxxxxxxxx
export ARM_TENANT_ID=xxxxx-xxxxx-xxxx-xxxx
- Use the az cli with the login command.
First, check that we have an empty Azure subscription without any Azure resource group:
Initialization
The initialization step does the following:
- Initializes the Terraform context and makes the connection between the Terraform provider and the remote service, in this case Azure.
- Downloads the plugins of the providers, in this case azurerm.
- Checks the code variables.
To execute the initialization, run the command:
terraform init
A .terraform directory will also be created.
Previewing the changes
With the plan command, it is possible to preview the changes made to the infrastructure before applying them.
Applying the changes
After previewing the changes, we can apply them to our infrastructure with the apply command.
The Azure resources will look like the following:
Understanding the Terraform life cycle with different command-line options
As you might have figured out, applying changes to an infrastructure with Terraform involves three commands: init, plan, and apply. But there are more commands available.
Using destroy to better rebuild
To destroy infrastructure previously built with Terraform, you need to enter the command terraform destroy. This command destroys only the resources configured in the current Terraform configuration. Note, however, that if the Terraform code provisions a resource group, destroying it will also destroy all of its content.
Formatting and validating the configuration
The command terraform fmt is used to reformat the code in the .tf files.
To detect possible errors before running plan or apply, you can run the command terraform validate.
The Terraform life cycle within a CI/CD process
When using Terraform locally, the execution life cycle is as follows:
IaC, like applications, must be deployed or executed in an automatic CI/CD process:
This is done with slight modifications to the following commands:
- The plan command, which will look like terraform plan -out=out.tfplan.
- The apply command, which will look like terraform apply --auto-approve out.tfplan. The --auto-approve option is also available for the destroy command.
Protecting the state file with a remote backend
When the command apply is executed for the first time, Terraform creates a terraform.tfstate file that contains a JSON representation of the resource properties. This file is really important and must be protected because:
- It contains the status of the infrastructure: Without it, Terraform might not behave as expected, since this file is used by the plan command to compare the changes in the resources.
- It must be accessible, at the same time, to all of the team members (and only to them).
- It may contain sensitive data.
- When using multiple environments, it is necessary to be able to use multiple state files.
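An abbreviated, illustrative view of what terraform.tfstate contains for the resource group above (fields simplified; not a literal dump):

```json
{
  "version": 4,
  "terraform_version": "1.0.0",
  "resources": [
    {
      "type": "azurerm_resource_group",
      "name": "rg",
      "instances": [
        { "attributes": { "name": "bookRg", "location": "westeurope" } }
      ]
    }
  ]
}
```

Because the attributes section can hold secrets (connection strings, generated passwords), the file must never be committed to the SCM.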
To solve this problem, we will store this file in a remote backend. In Azure, we will use the azurerm remote backend. To do this we will:
Create a storage account: Through the portal or with az cli:
# Create resource group
az group create --name MyRgRemoteBackend --location westeurope
# Create storage account
az storage account create --resource-group MyRgRemoteBackend --name storageremotetf --sku Standard_LRS --encryption-services blob
# Get the key
ACCOUNT_KEY=$(az storage account keys list --resource-group MyRgRemoteBackend --account-name storageremotetf --query [0].value -o tsv)
# Create blob container
az storage container create --name tfbackends --account-name storageremotetf --account-key $ACCOUNT_KEY
Write the Terraform configuration: We configure Terraform to use the previously created remote backend:
terraform {
  backend "azurerm" {
    storage_account_name = "storageremotetf"
    container_name       = "tfbackends"
    key                  = "myappli.tfstate"
    snapshot             = true
  }
}
To pass the key value to Terraform, we need to set the ARM_ACCESS_KEY environment variable with the storage account access key value. Terraform can now be run with the new remote backend.
If multiple Terraform states are used to manage multiple environments, it’s possible to create several remote backend configurations with the code:
terraform {
backend "azurerm" {}
}
And then create several backend.tfvars files that contain the properties of the backend:
storage_account_name = "storageremotetf"
container_name = "tfbackends"
key = "myappli.tfstate"
snapshot = true
To specify the backend configuration in the init command, we write:
terraform init -backend-config="backend.tfvars"
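For example, with two hypothetical environments, dev and prod, each can have its own properties file:

```hcl
# backend-dev.tfvars (hypothetical)
storage_account_name = "storageremotetf"
container_name       = "tfbackends"
key                  = "myappli-dev.tfstate"
snapshot             = true
```

A backend-prod.tfvars file would differ only in its key (for example, myappli-prod.tfstate), and each environment is initialized with terraform init -backend-config pointing at its own file, so each keeps a separate state.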
Using Ansible for Configuring IaaS Infrastructure
Now that the infrastructure is provisioned, thanks to Terraform, it is necessary to configure the system and install all the necessary middleware. There are several Infrastructure as Code tools available. Ansible, from Red Hat, stands out for its many assets:
- Uses YAML language.
- Works with one executable.
- Doesn’t require agents on the VMs: It requires only a WinRM connection, for Windows VMs, or an SSH connection, for Linux VMs.
- Has a template engine and a vault to encrypt/decrypt data.
- Is idempotent.
Ansible can also be used for infrastructure provisioning, like Terraform, but with YAML configuration.
In this chapter, Ansible will be used to configure a VM with an inventory and a playbook. Technical requirements for this chapter are:
- A Linux OS.
- Python 2 or 3.
- Azure Python SDK: Since in the last section we will run the Ansible dynamic inventory for Azure. The complete source code of this chapter is available here.
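As a preview of what follows, a minimal inventory and playbook could look like this sketch (the host IP and the installed package are illustrative assumptions, not from the text):

```yaml
# inventory.yml -- hypothetical VM reached over SSH
all:
  hosts:
    webvm:
      ansible_host: 10.0.10.4

# playbook.yml -- idempotent configuration of the VM
- name: Configure web VM
  hosts: all
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.apt:
        name: nginx
        state: present
```

Running the playbook twice changes nothing the second time, which is what Ansible's idempotence means in practice.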
Installing Ansible
WIP